[1] 丁世飞,杜威,张健,等.多智能体深度强化学习研究进展[J].计算机学报,2024,47(07):1547-1567.
Ding Shifei, Du Wei, Zhang Jian, et al. Research Progress on Multi-Agent Deep Reinforcement Learning [J]. Chinese Journal of Computers, 2024, 47(07): 1547-1567.
[2] 段勇,徐心和.基于多智能体强化学习的多机器人协作策略研究[J].系统工程理论与实践,2014,34(05):1305-1310.
Duan Yong, Xu Xinhe. Research on Multi-Robot Cooperation Strategies Based on Multi-Agent Reinforcement Learning [J]. Systems Engineering Theory & Practice, 2014, 34(05): 1305-1310.
[3] 许宏鑫,吴志周,梁韵逸.基于强化学习的自动驾驶汽车路径规划方法研究综述[J].计算机应用研究,2023,40(11):3211-3217.DOI:10.19734/j.issn.1001-3695.2023.03.0131.
Xu Hongxin, Wu Zhizhou, Liang Yunyi. A Review of Path Planning Methods for Autonomous Vehicles Based on Reinforcement Learning [J]. Application Research of Computers, 2023, 40(11): 3211-3217. DOI: 10.19734/j.issn.1001-3695.2023.03.0131.
[4] 邹启杰,蒋亚军,高兵,等.协作多智能体深度强化学习研究综述[J].航空兵器,2022,29(06):78-88.
Zou Qijie, Jiang Yajun, Gao Bing, et al. A Review of Cooperative Multi-Agent Deep Reinforcement Learning Research [J]. Aero Weaponry, 2022, 29(06): 78-88.
[5] 赵立阳,常天庆,褚凯轩,等.完全合作类多智能体深度强化学习综述[J].计算机工程与应用,2023,59(12):14-27.
Zhao Liyang, Chang Tianqing, Chu Kaixuan, et al. A Review of Fully Cooperative Multi-Agent Deep Reinforcement Learning [J]. Computer Engineering and Applications, 2023, 59(12): 14-27.
[6] 熊丽琴,曹雷,赖俊,等.基于值分解的多智能体深度强化学习综述[J].计算机科学,2022,49(09):172-182.
Xiong Liqin, Cao Lei, Lai Jun, et al. A Review of Value Decomposition-Based Multi-Agent Deep Reinforcement Learning [J]. Computer Science, 2022, 49(09): 172-182.
[7] Peter S, Guy L, Audrunas G, et al. Value-Decomposition Networks for Cooperative Multi-Agent Learning Based on Team Reward[C]//Proceedings of the International Conference on Autonomous Agents and MultiAgent Systems. 2018: 2085-2087.
[8] Tabish R, Mikayel S, Christian Schroeder de W, et al. Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning[J]. Journal of Machine Learning Research, 2018: 4292-4301.
[9] Kyunghwan S, Daewoo K, Wan Ju K, et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning[C]//Proceedings of the International Conference on Machine Learning. 2019: 5887-5896.
[10] Yaodong Y, Jianye H, Ben L, et al. Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning[J]. Computing Research Repository, 2020.
[11] Yu N, Hengxu Z, Lei Y. MA-MIX: Value Function Decomposition for Cooperative Multiagent Reinforcement Learning Based on Multi-Head Attention Mechanism[C]//Adaptive Agents and Multi-Agent Systems. 2024: 2402-2404.
[12] Tabish R, Gregory F, Bei P, et al. Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning[C]//Advances in Neural Information Processing Systems. 2020, 33.
[13] Wei D, Shifei D, Lili G, et al. Value Function Factorization with Dynamic Weighting for Deep Multi-Agent Reinforcement Learning[J]. Information Sciences, 2022, 615: 191-208.
[14] Shifei D, Xiaomin D, Jian Z, et al. Multi-Agent Policy Gradients with Dynamic Weighted Value Decomposition[J]. Pattern Recognition, 2025: 111576.
[15] Jerker D. Reinforcement Learning Leads to Risk Averse Behavior[C]//Proceedings of the Annual Meeting of the Cognitive Science Society. 2008, 30(30).
[16] Shanqi L, Yujing H, Runze W, et al. Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning[J]. Computing Research Repository, 2023: 31-39.
[17] Bei P, Tabish R, Christian Schroeder de W, et al. FACMAC: Factored Multi-Agent Centralised Policy Gradients[C]//Conference on Neural Information Processing Systems. 2021.
[18] Ryan L, Yi W, Aviv T, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments[C]//Advances in Neural Information Processing Systems. 2017.
[19] Jianyu S, Stephen A, Peter B. Value-Decomposition Multi-Agent Actor-Critics[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2021, 35(13): 11352-11360.
[20] Ashish V, Noam S, Niki P, et al. Attention is All You Need[C]//Advances in Neural Information Processing Systems. 2017, 30: 5998-6008.
[21] Navid N, Fan HH, Sean S, et al. Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning[C]//International Conference on Learning Representations. 2021.
[22] William LH. The Graph Neural Network Model[M]//Graph Representation Learning. 2020: 51-70.
[23] Songchen F, Shaojing Z, Ta L, et al. QTypeMix: Enhancing Multi-Agent Cooperative Strategies Through Heterogeneous and Homogeneous Value Decomposition[J]. Neural Networks, 2025, 184.
[24] Zhitong Z, Ya Z, Wenyu C, et al. Sequence Value Decomposition Transformer for Cooperative Multi-Agent Reinforcement Learning[J]. Information Sciences, 2025, 720.
[25] Yali D, Lei H, Meng F, et al. LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning[C]//Advances in Neural Information Processing Systems. 2019, 32: 4403-4414.
[26] Tonghan W, Jianhao W, Yi W, et al. Influence-Based Multi-Agent Exploration[C]//International Conference on Learning Representations. 2019.
[27] Maxime T, Nicolas B, Jae-Yun J. Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent Deep Reinforcement Learning[J]. Computing Research Repository, 2024: 2522-2524.
[28] Zifan L, Shibo C, Jun Z. Individual Contributions as Intrinsic Exploration Scaffolds for Multi-Agent Reinforcement Learning[J]. Computing Research Repository, 2024.
[29] 殷昌盛,杨若鹏,朱巍,等.多智能体分层强化学习综述[J].智能系统学报,2020,15(04):646-655.
Yin Changsheng, Yang Ruopeng, Zhu Wei, et al. A Review of Multi-Agent Hierarchical Reinforcement Learning [J]. CAAI Transactions on Intelligent Systems, 2020, 15(04): 646-655.
[30] Anuj M, Tabish R, Mikayel S, et al. MAVEN: Multi-Agent Variational Exploration[C]//Advances in Neural Information Processing Systems. 2019.
[31] Zhiwei X, Yunpeng B, Bin Z, et al. HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism[J/OL]. Computing Research Repository, 2023: 11735-11743.
[32] Jiechuan J, Zongqing L. Learning Fairness in Multi-Agent Systems[J/OL]. arXiv, 2019.
[33] Pu F, Junkang L, Size W, et al. Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. 2024: 642-649.
[34] Xuechen M, Hankz H Z, Chen, et al. Hierarchical Task Network-Enhanced Multi-Agent Reinforcement Learning: Toward Efficient Cooperative Strategies[J]. Neural Networks, 2025, 186.
[35] Jian H, Siyang J, Seth Austin H, et al. Rethinking the Implementation Tricks and Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning[C]//International Conference on Learning Representations. 2023.
[36] Mikayel S, Tabish R, Christian Schroeder de W, et al. The StarCraft Multi-Agent Challenge[J/OL]. Computing Research Repository, 2019.